Exploiting Locality in Sparse Matrix-Matrix Multiplication on Many-Core Architectures

Authors
Abstract

Similar resources

Multi-threaded Sparse Matrix-Matrix Multiplication for Many-Core and GPU Architectures

Sparse matrix-matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high performance computing archite...

Accelerating Sparse Matrix Vector Multiplication on Many-Core GPUs

Many-core GPUs provide high computing ability and substantial bandwidth; however, optimizing irregular applications like SpMV on GPUs becomes a difficult but meaningful task. In this paper, we propose a novel method to improve the performance of SpMV on GPUs. A new storage format called HYB-R is proposed to exploit GPU architecture more efficiently. The COO portion of the matrix is partitioned ...

Exploiting Locality on Parallel Sparse Matrix Computations

By now, irregular problems are difficult to parallelize in an automatic way because of their lack of regularity in data access patterns. Most times, programmers must hand-write a particular solution for each problem separately. In this paper we present two pseudo-regular distributions which can be applied to partition most problems achieving very good average case distributions. Also, we have des...

Efficient Sparse Matrix-Matrix Multiplication on Multicore Architectures

We describe a new parallel sparse matrix-matrix multiplication algorithm in shared memory using a quadtree decomposition. Our implementation is nearly as fast as the best sequential method on one core, and scales quite well to multiple cores.

Efficient Sparse Matrix-Matrix Multiplication on Multicore Architectures

We describe a new parallel sparse matrix-matrix multiplication algorithm in shared memory using a quadtree decomposition. Our preliminary implementation is nearly as fast as the best sequential method on one core, and scales well to multiple cores.

Journal

Journal title: IEEE Transactions on Parallel and Distributed Systems

Year: 2017

ISSN: 1045-9219

DOI: 10.1109/tpds.2017.2656893